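Many robotic tasks involving some form of 3D visual perception greatly benefit from complete knowledge of the working environment. However, robots often have to cope with unstructured environments, and their onboard visual sensors can only provide incomplete information because of limited workspaces, clutter, or object self-occlusions. In recent years, deep learning architectures for shape completion have begun to gain traction as an effective means of inferring a complete 3D object representation from partial visual data. Nevertheless, most existing state-of-the-art approaches provide a fixed output resolution in the form of voxel grids, which is strictly tied to the size of the neural network's output stage. While this is sufficient for some tasks, such as obstacle avoidance in navigation, grasping and manipulation require finer resolutions, and simply scaling up the neural network output is computationally expensive. In this paper, we address this limitation with an object shape completion method based on an implicit 3D representation that provides a confidence value for each reconstructed point. As a second contribution, we propose a gradient-based method for efficiently sampling such an implicit function at an arbitrary resolution at inference time. We experimentally validate our approach by comparing the reconstructed shapes with the ground truth, and by deploying our shape completion algorithm in a robotic grasping pipeline. In both cases, we compare the results with a state-of-the-art shape completion approach.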
Translated by Google Translate
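Action recognition is a fundamental capability for humanoid robots that interact and cooperate with humans. This application requires the action recognition system to be designed so that new actions can be added easily, while unknown actions are identified and ignored. In recent years, deep learning approaches have been the principal solution to the action recognition problem. However, most models require a large dataset of manually labeled samples. In this work, we target one-shot deep learning models, because they can work with just a single instance per class. Unfortunately, one-shot models assume that, at inference time, the action to be recognized falls into the support set, and they fail when the action lies outside it. Few-Shot Open-Set Recognition (FSOSR) solutions attempt to address this flaw, but current solutions consider only static images rather than image sequences. Static images remain insufficient to discriminate between actions such as sitting down and standing up. In this paper, we propose a novel model that addresses the FSOSR problem with a one-shot model augmented with a discriminator that rejects unknown actions. This model is useful for applications in humanoid robotics, because it allows new classes to be added easily and determines whether an input sequence is among those known to the system. We show how to train the whole model in an end-to-end fashion, and we perform quantitative and qualitative analyses. Finally, we provide real-world examples.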
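Introduction: A 12-lead electrocardiogram (ECG) is recorded during the atrial fibrillation (AF) catheter ablation procedure (CAP). It is not easy to determine whether CAP was successful without a long follow-up assessing AF recurrence (AFR). An AFR risk prediction algorithm could therefore enable better management of CAP patients. In this research, we extract features from 12-lead ECGs recorded before and after CAP and train an AFR risk prediction machine learning model. Methods: Pre- and post-CAP segments were extracted from 112 patients. The analysis included a signal quality criterion, heart rate variability, and morphological biomarkers engineered from the 12-lead ECG (804 features overall). The AFR clinical endpoint was available for 43 of the 112 patients. These were used to assess the feasibility of AFR risk prediction using either pre-CAP or post-CAP features. A random forest classifier was trained within a nested cross-validation framework. Results: 36 features were found to be statistically significant for distinguishing between the pre- and post-surgery states (n = 112). For the classification, the area under the receiver operating characteristic (AUROC) curve was reported, with AUROC_pre = 0.64 and AUROC_post = 0.74 (n = 43). Discussion and conclusions: This preliminary analysis shows the feasibility of AFR risk prediction. Such a model could be used to improve CAP management.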
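For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from classifier scores by pairwise comparison (the Mann-Whitney formulation). The function name `auroc` and the toy scores are hypothetical and are not taken from the study; the study's own pipeline (random forest, nested cross-validation) is not reproduced here.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only: 1 = recurrence, 0 = no recurrence (illustrative labels).
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    print(auroc(toy_scores, toy_labels))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported 0.64 and 0.74 in context.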
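In the Random Subset Sum Problem, given $n$ i.i.d. random variables $X_1, \dots, X_n$, we wish to approximate any point $z \in [-1,1]$ as the sum of a suitable subset $X_{i_1(z)}, \dots, X_{i_s(z)}$ of them, up to error $\varepsilon$. Despite its simple statement, this problem is of fundamental interest to both theoretical computer science and statistical mechanics. More recently, it has gained renewed attention for its implications in the theory of artificial neural networks. An obvious multidimensional generalisation of the problem is to consider $n$ i.i.d. $d$-dimensional random vectors, with the objective of approximating every point $\mathbf{z} \in [-1,1]^d$. Rather surprisingly, after Lueker's 1998 proof that, in the one-dimensional setting, $n = O(\log \frac{1}{\varepsilon})$ samples guarantee the approximation property with high probability, little progress has been made on achieving the above generalisation. In this work, we prove that, in $d$ dimensions, $n = O(d^3 \log \frac{1}{\varepsilon} \cdot (\log \frac{1}{\varepsilon} + \log d))$ samples suffice for the approximation property to hold with high probability. As an application highlighting the potential interest of this result, we prove that a recently proposed neural network model exhibits \emph{universality}: with high probability, the model can approximate any neural network within a polynomial overhead in the number of parameters.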
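The one-dimensional statement can be explored numerically. The sketch below is an illustration only, not the paper's method or proof: it assumes samples uniform on $[-1,1]$, enumerates all subset sums by brute force (feasible only for small $n$), and measures the worst approximation error over a grid of targets. The function names, the grid size, and the choice of distribution are all assumptions made for this example.

```python
import bisect
import random


def all_subset_sums(samples):
    """Return the sorted list of sums over every subset (2**n values)."""
    sums = [0.0]  # the empty subset
    for x in samples:
        sums += [s + x for s in sums]
    return sorted(sums)


def approximation_error(sorted_sums, z):
    """Distance from z to the nearest available subset sum."""
    i = bisect.bisect_left(sorted_sums, z)
    candidates = sorted_sums[max(i - 1, 0): i + 1]
    return min(abs(c - z) for c in candidates)


def worst_error(n, seed, grid_points=201):
    """Worst error over an evenly spaced grid of targets in [-1, 1]."""
    rng = random.Random(seed)
    samples = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    sums = all_subset_sums(samples)
    step = 2.0 / (grid_points - 1)
    targets = [-1.0 + step * k for k in range(grid_points)]
    return max(approximation_error(sums, z) for z in targets)


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        print(n, worst_error(n, seed=0))
```

Because the same seed yields the same leading samples, the subset sums for a larger $n$ contain those for a smaller $n$, so the measured error cannot increase as $n$ grows within one run.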
Computational units in artificial neural networks follow a simplified model of biological neurons. In the biological model, the output signal of a neuron runs down the axon, splits following the many branches at its end, and passes identically to all the downward neurons of the network. Each of the downward neurons uses its copy of this signal as one of many dendritic inputs, integrates them all, and fires an output if the result is above some threshold. In the artificial neural network, this translates to the fact that the nonlinear filtering of the signal is performed in the upward neuron, meaning that in practice the same activation is shared between all the downward neurons that use that signal as their input. Dendrites thus play a passive role. We propose a slightly more complex model for the biological neuron, where dendrites play an active role: the activation in the output of the upward neuron becomes optional, and instead the signals going through each dendrite undergo independent nonlinear filtering before the linear combination. We implement this new model in a ReLU computational unit and discuss its biological plausibility. We compare this new computational unit with the standard one and describe it from a geometrical point of view. We provide a Keras implementation of this unit in fully connected and convolutional layers and estimate the resulting change in FLOPs and number of weights. We then use these layers in ResNet architectures on CIFAR-10, CIFAR-100, Imagenette, and Imagewoof, obtaining performance improvements of up to 1.73% over standard ResNets. Finally, we prove a universal representation theorem for continuous functions on compact sets and show that this new unit has more representational power than its standard counterpart.
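The contrast between the two units can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' Keras implementation: the per-dendrite form used here, a ReLU with its own threshold applied to each input before weighting, is one plausible reading of "independent nonlinear filtering before the linear combination", and the names `standard_unit`, `dendritic_unit`, and `thresholds` are hypothetical.

```python
def relu(v):
    """Rectified linear activation."""
    return max(0.0, v)


def standard_unit(x, w, b):
    """Standard unit: linear combination first, then one shared ReLU."""
    return relu(sum(wi * xi for wi, xi in zip(w, x)) + b)


def dendritic_unit(x, w, thresholds):
    """Assumed 'active dendrite' unit: each input is filtered by its own
    ReLU (with a per-dendrite threshold) before the linear combination."""
    return sum(wi * relu(xi - ti) for wi, xi, ti in zip(w, x, thresholds))


if __name__ == "__main__":
    x = [1.0, -2.0]
    w = [0.5, 1.0]
    print(standard_unit(x, w, b=0.0))
    print(dendritic_unit(x, w, thresholds=[0.0, 0.0]))
```

In this sketch, a strongly negative input can silence the standard unit entirely, whereas in the per-dendrite variant it is clipped on its own branch and the other inputs still contribute.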
Humans have internal models of robots (like their physical capabilities), the world (like what will happen next), and their tasks (like a preferred goal). However, human internal models are not always perfect: for example, it is easy to underestimate a robot's inertia. Nevertheless, these models change and improve over time as humans gather more experience. Interestingly, robot actions influence what this experience is, and therefore influence how people's internal models change. In this work we take a step towards enabling robots to understand the influence they have, leverage it to better assist people, and help human models more quickly align with reality. Our key idea is to model the human's learning as a nonlinear dynamical system which evolves the human's internal model given new observations. We formulate a novel optimization problem to infer the human's learning dynamics from demonstrations that naturally exhibit human learning. We then formalize how robots can influence human learning by embedding the human's learning dynamics model into the robot planning problem. Although our formulations provide concrete problem statements, they are intractable to solve in full generality. We contribute an approximation that sacrifices the complexity of the human internal models we can represent, but enables robots to learn the nonlinear dynamics of these internal models. We evaluate our inference and planning methods in a suite of simulated environments and an in-person user study, where a 7DOF robotic arm teaches participants to be better teleoperators. While influencing human learning remains an open problem, our results demonstrate that this influence is possible and can be helpful in real human-robot interaction.
Explainability is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the topic, yet explainability still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the product of evidence stemming from the model and its input-output and the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's decision-making) and plausibility (i.e., how convincing the explanation looks to the user). Using our proposed theoretical framework simplifies how these properties are operationalized and provides new insight into common explanation methods that we analyze as case studies.
Fruit is a key crop in worldwide agriculture, feeding millions of people. The standard supply chain of fruit products involves quality checks to guarantee freshness, taste, and, most of all, safety. An important factor that determines fruit quality is its stage of ripening. This is usually classified manually by experts in the field, which makes it a labor-intensive and error-prone process. Thus, there is a growing need for automation in the process of fruit ripeness classification. Many automatic methods have been proposed that employ a variety of feature descriptors for the food item to be graded. Machine learning and deep learning techniques dominate the top-performing methods. Furthermore, deep learning can operate on raw data and thus relieve users from having to compute complex engineered features, which are often crop-specific. In this survey, we review the latest methods proposed in the literature to automate fruit ripeness classification, highlighting the most common feature descriptors they operate on.
Graph Neural Networks (GNNs) achieve state-of-the-art performance on graph-structured data across numerous domains. Their underlying ability to represent nodes as summaries of their vicinities has proven effective for homophilous graphs in particular, in which same-type nodes tend to connect. On heterophilous graphs, in which different-type nodes are likely connected, GNNs perform less consistently, as neighborhood information might be less representative or even misleading. On the other hand, GNN performance is not inferior on all heterophilous graphs, and there is a lack of understanding of what other graph properties affect GNN performance. In this work, we highlight the limitations of the widely used homophily ratio and the recent Cross-Class Neighborhood Similarity (CCNS) metric in estimating GNN performance. To overcome these limitations, we introduce 2-hop Neighbor Class Similarity (2NCS), a new quantitative graph structural property that correlates with GNN performance more strongly and consistently than alternative metrics. 2NCS considers two-hop neighborhoods as a theoretically derived consequence of the two-step label propagation process governing GCN's training-inference process. Experiments on one synthetic and eight real-world graph datasets confirm consistent improvements over existing metrics in estimating the accuracy of GCN- and GAT-based architectures on the node classification task.
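As background for the homophily ratio mentioned above, here is a minimal sketch of one common variant, the edge homophily ratio: the fraction of edges whose endpoints share a class label. The abstract does not say which variant it uses, and this sketch does not implement CCNS or 2NCS; the function name and the toy graph are hypothetical.

```python
def edge_homophily_ratio(edges, labels):
    """Fraction of edges that connect two nodes of the same class.

    edges  -- iterable of (u, v) node-index pairs
    labels -- sequence mapping node index to class label
    """
    edges = list(edges)
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


if __name__ == "__main__":
    # Toy path graph 0-1-2-3 with two classes (illustrative only).
    toy_edges = [(0, 1), (1, 2), (2, 3)]
    toy_labels = [0, 0, 1, 1]
    print(edge_homophily_ratio(toy_edges, toy_labels))
```

Values near 1 indicate a homophilous graph and values near 0 a heterophilous one. Being a single scalar over edges, the ratio cannot distinguish graphs whose cross-class connections follow different patterns, which is the kind of limitation the abstract points to.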
In recent years, reinforcement learning (RL) has become increasingly successful in its application to science and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent's learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: first, we use an RL agent to generate data; then, we employ a mining algorithm to extract gadgets; and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states, where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment, where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach to analyzing the policy of a learned agent is agent- and environment-agnostic and can yield interesting insights into any agent's policy.